Method and apparatus for watermarking binary computer code with modified compiler optimizations

ABSTRACT

A system and apparatus for inserting a watermark into a compiled computer program selectively replaces specified optimizations by non-optimized code to encode bit values of the watermark. The watermark is read by decoding the executable code and assigning the decoded bit values, determined by the presence or absence of optimized code, to bit positions in a signature.

BACKGROUND OF THE INVENTION

[0001] It can be useful to be able to identify the code produced by different compilers to identify non-licensed uses of the compilers, and to track errors. Accordingly, compiler manufacturers require a method of including a serial number or other identifying mark in code produced by a compiler. Additionally, a method of analyzing a copy of the compiled code to determine the serial number or identifying mark is also required.

[0002] A private watermark, which is data hidden via steganography, is one method for tracking the outputs of licensed programs. However, traditional steganography requires the presence of “low order” bits in the data stream. The low order bits can be changed without the data changing so much that a human can notice the difference. The changed bits, detected when the modified field is compared to the original, can hold the steganographic data. Since traditional steganography changes non-significant low-order bits, steganography is normally applied to digital pictures and sounds.
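
By way of illustration only, the low-order-bit technique described above can be sketched in C as follows. The sketch assumes 8-bit samples such as grayscale pixels, and the function names lsb_embed and lsb_extract are hypothetical. For simplicity the payload is read directly from the low-order bit rather than by comparison with the original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: hide one payload bit in the least significant bit
 * of each 8-bit sample. Each sample changes by at most 1, which is the
 * kind of change a human is unlikely to notice in a picture or sound. */
void lsb_embed(uint8_t *samples, size_t n, const uint8_t *bits)
{
    for (size_t k = 0; k < n; k++) {
        assert(bits[k] <= 1);                 /* payload entries are 0 or 1 */
        samples[k] = (uint8_t)((samples[k] & 0xFEu) | bits[k]);
    }
}

/* Recover the payload by reading each low-order bit back out. */
void lsb_extract(const uint8_t *samples, size_t n, uint8_t *bits)
{
    for (size_t k = 0; k < n; k++)
        bits[k] = (uint8_t)(samples[k] & 1u);
}
```

Applying the same bit-flipping to the bytes of an executable would alter instructions or operands, which is why this approach does not carry over to compiled code.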

[0003] Steganography in computer code can't be done with the normal methods because computer code does not contain low-order bits. Every bit in the code is important, and flipping even one bit can prevent the code from operating correctly.

[0004] Accordingly, improved techniques for inserting identifying watermarks in compiled programs are needed.

BRIEF SUMMARY OF THE INVENTION

[0005] In one embodiment of the invention, a method for generating and auditing a watermark for a compiled computer program is provided. The watermark is an integral part of the program and does not appear as an external data item.

[0006] In another embodiment, a watermarking module selectively replaces optimized code segments with non-optimized code segments. For a current signature digit, the optimized code segment is replaced by a non-optimized code segment only if the signature digit has a first binary value. The presence of the optimized code segment encodes the second binary value.

[0007] In another embodiment of the invention, a watermarking module searches the executable code for the presence of optimized code for unrolling a loop. If the current signature digit has a first binary value then the optimized code is replaced by non-optimized code to encode the first binary value in the watermark.

[0008] In another embodiment of the invention, watermarked executable code is searched for the presence of optimized and non-optimized code segments. If a non-optimized code segment is detected then a current signature digit is assigned the first binary value. If an optimized code segment is detected the current signature digit is assigned the second binary value.

[0009] Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIG. 1 is a block diagram of a computer system 10 configured to implement an embodiment of the invention;

[0011] FIG. 2 is a block diagram depicting the operation of a first embodiment that encodes the watermark as a loop-unrolling non-optimization;

[0012] FIG. 3 is a flowchart of the watermark encoding process of an embodiment of the invention; and

[0013] FIG. 4 is a flowchart of the watermark decoding process of an embodiment of the invention.

DETAILED DESCRIPTION OF THE INVENTION

[0014] The invention will now be described, by way of example, not limitation, with reference to various embodiments. FIG. 1 is a block diagram of a computer system 10 configured to implement an embodiment of the invention. The computer system 10 includes a computer 12, an input device 14 such as a keyboard, and an output device 16 such as a display screen. The computer 12 includes a main memory 18, which may include RAM and NVRAM, a central processing unit (“CPU”) 20, and a secondary memory 22. A compiler 24, a source code module 26, a compiled program 30, and a watermarking module 32 for inserting and retrieving the watermark from the compiled code 30 are stored in secondary memory 22.

[0015] The operation of the first embodiment will now be described in more detail with reference to FIGS. 2-4. The general concept is to encode first and second bit values, e.g., 0 and 1, as either optimized or non-optimized code in the compiled program. For example, the presence of optimized code could encode the value “1” and the presence of non-optimized code could encode the value “0”.

[0016] Generally, compilers optimize code by using techniques such as constant propagation (replacing expressions that evaluate to a constant with a constant value), copy propagation (replacing an assignment by the assigned value), strength reduction (replacing operations by more efficient operations), loop unrolling (replacing a loop with the equivalent straight-line code), and so on.
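
The following C fragment is a minimal sketch of three of the optimizations listed above, written as pairs of source-level functions so that the "before" and "after" forms can be compared. The function names are hypothetical, and an actual compiler performs these transformations on its internal representation rather than on source text.

```c
#include <assert.h>

/* Constant propagation: n is known to be 8, so n * 4 is replaced by 32. */
unsigned const_prop_before(void) { unsigned n = 8u; return n * 4u; }
unsigned const_prop_after(void)  { return 32u; }

/* Copy propagation: uses of the copy b are replaced by the original a. */
unsigned copy_prop_before(unsigned a) { unsigned b = a; return b + b; }
unsigned copy_prop_after(unsigned a)  { return a + a; }

/* Strength reduction: a multiply by 8 is replaced by a cheaper shift. */
unsigned strength_before(unsigned x) { return x * 8u; }
unsigned strength_after(unsigned x)  { return x << 3; }

/* Each optimized form must compute the same value as its original. */
void check_equivalence(unsigned x)
{
    assert(const_prop_before() == const_prop_after());
    assert(copy_prop_before(x) == copy_prop_after(x));
    assert(strength_before(x) == strength_after(x));
}
```

Because each pair computes the same result, either form may appear in the compiled program without changing its behavior, which is the property the watermark relies on.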

[0017] Modern compilers make many choices of methods to optimize code as they are compiling it. A method of watermarking code can be executed by changing the choice of optimizations that the compiler makes.

[0018] For example, take the following C code: for (i=1; i<=3; i++){x+=x*i;}. Most compilers would “unroll” this code, producing an optimized object code segment as though the C code had been: x+=x*1; x+=x*2; x+=x*3; thus saving the cost of incrementing and testing i. If, instead, the compiler chose not to unroll the loop, the non-optimized code segment would represent one bit of watermarked information.
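
The two forms can be written out as a short C sketch. The loop bound is chosen so that the body executes for i = 1, 2 and 3, matching the three unrolled statements. The function names rolled, unrolled and check_unroll are hypothetical.

```c
#include <assert.h>

/* Non-optimized form: the counter i is incremented and tested on each pass. */
long rolled(long x)
{
    for (int i = 1; i <= 3; i++) {
        x += x * i;
    }
    return x;
}

/* Optimized form: the loop body is repeated once per iteration,
 * with i replaced by its value, so no counter is needed. */
long unrolled(long x)
{
    x += x * 1;
    x += x * 2;
    x += x * 3;
    return x;
}

/* Both forms must produce the same result for the same input. */
void check_unroll(long x)
{
    assert(rolled(x) == unrolled(x));
}
```

Since the two forms are functionally identical, the choice between them is free to carry one bit of information.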

[0019] The process of watermarking will now be described in more detail with reference to FIGS. 2-4. FIG. 2 includes a first block 30 depicting the program code, a second block 32 depicting the signature data to be encoded as a watermark, a third block 34 depicting the optimized compiled code output by the compiler, and a fourth block 36 depicting the modified compiled code having the value of the first bit of the signature data encoded therein.

[0020] FIG. 3 is a flowchart depicting the acts performed to encode the signature as a watermark in the compiled code. The code is compiled to generate the compiled code 34 and the first bit of signature data 32, in this example having a value “1”, is accessed. The compiled code is then searched for optimized code that will be used to encode this bit value.

[0021] In this example, the optimized code is depicted in the third block 34. This optimized code is replaced by non-optimized code as depicted in the fourth block 36. The presence of this non-optimized code encodes a bit value of “1” for the first digit in the watermark.

[0022] Subsequently, the second digit, “0”, of the signature is retrieved. The next instance of an unrolled loop is then detected. In this case the optimized code is not replaced by non-optimized code, thereby encoding the bit value “0” for the second digit of the signature.

[0023] Thus, the values of the successive bits in the signature would be encoded into the program code as a series of blocks of optimized code and non-optimized code, with the presence of non-optimized code encoding a first bit value and the presence of optimized code encoding a second bit value. The program loops until all the bit values in the signature have been encoded as a watermark into the compiled program.
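
The encoding loop of FIG. 3 can be sketched in C under simplifying assumptions. Here the compiled program is modeled as an array with one entry per code segment that the compiler optimized, and "replacing" a segment is modeled by changing the entry. The type segment_form and the function watermark_encode are hypothetical, and a practical watermarking module would rewrite object code instead. The bit convention follows the example above, in which non-optimized code encodes “1”.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one optimizable code segment in the compiled program. */
typedef enum { SEG_OPTIMIZED, SEG_NON_OPTIMIZED } segment_form;

/* Encode nbits signature bits, one per segment, in program order.
 * A bit of 1 replaces the optimized segment by non-optimized code;
 * a bit of 0 leaves the optimized segment in place.
 * Returns the number of bits encoded. */
size_t watermark_encode(segment_form *sites, size_t nsites,
                        const unsigned char *bits, size_t nbits)
{
    assert(nbits <= nsites);               /* need one segment per bit */
    for (size_t k = 0; k < nbits; k++) {
        assert(bits[k] <= 1);              /* signature digits are binary */
        assert(sites[k] == SEG_OPTIMIZED); /* compiler output is optimized */
        if (bits[k] == 1)
            sites[k] = SEG_NON_OPTIMIZED;
    }
    return nbits;
}
```

For the signature 1, 0, 1, 1 and four optimized segments, the sketch leaves only the second segment optimized.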

[0024] The watermarked data can be retrieved by examining the data with a watermarking module that understands the compiler's optimization algorithm, and outputs the bits related to its non-optimal choices. This process will now be described with reference to FIG. 4.

[0025] Referring to FIG. 4, the watermarked code is searched for the presence of optimized code or substituted non-optimized code. If non-optimized code is detected then a first bit value is assigned to the current digit of the signature and if optimized code is detected then a second bit value is assigned to the current digit of the signature. The program loops until all the selected optimizations and non-optimizations have been decoded.
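
The decoding loop of FIG. 4 can be sketched in the same simplified model. The sketch assumes that an earlier search has already classified each watermark-carrying segment as optimized or non-optimized, that non-optimized code encodes “1”, and that the first segment holds the most significant digit of the signature. The bit order, the 32-bit signature width, and the names detected_form and watermark_decode are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of classifying one segment of the watermarked code. */
typedef enum { FORM_OPTIMIZED, FORM_NON_OPTIMIZED } detected_form;

/* Assign one decoded bit per segment to successive bit positions
 * of the signature, first segment first. */
uint32_t watermark_decode(const detected_form *sites, size_t nsites)
{
    assert(nsites <= 32);                  /* signature fits in 32 bits */
    uint32_t signature = 0;
    for (size_t k = 0; k < nsites; k++) {
        uint32_t bit = (sites[k] == FORM_NON_OPTIMIZED) ? 1u : 0u;
        signature = (signature << 1) | bit;
    }
    return signature;
}
```

Segments detected as non-optimized, optimized, non-optimized, non-optimized therefore decode to the binary signature 1011.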

[0026] In the above example, both bit values were encoded by whether a loop-unrolling optimization was retained or replaced by non-optimized code. Other optimizations, for example constant replacement, can be utilized in the same manner. Alternatively, a combination of optimizations can be utilized to encode the bit values, for example a loop unroll and constant replacements. The presence of the optimized code encodes one bit value and the presence of the non-optimized code encodes the other bit value.

[0027] The invention has now been described with reference to the preferred embodiments. Alternatives and substitutions will now be apparent to persons of ordinary skill in the art. For example, other optimizations than the specific examples described can be utilized to encode the bit values. Additionally, the encoding and decoding processes can be incorporated as part of the compiler or be implemented as independent processes. Accordingly, it is not intended to limit the invention except as provided by the appended claims. 

What is claimed is:
 1. In a computer system, a method for encoding digital data as a watermark in executable computer code generated by a compiler, the watermark including a plurality of binary digits, with the compiler replacing selected non-optimized code segments with optimized code segments, the method comprising: detecting an optimized code segment; and changing compiler optimization choices to encode a watermark.
 2. In a computer system, a method for encoding digital data as a watermark in executable computer code generated by a compiler, the watermark including a plurality of binary digits, with the compiler replacing a first non-optimized code segment with a first optimized code segment, the method comprising: for a current binary digit in the watermark: replacing a first optimized code segment, included in the executable computer code, with a first non-optimized code segment only if the current binary digit has a first value, where the presence of the non-optimized code segment encodes the first value and the presence of optimized code segment encodes the second value of the current binary digit.
 3. The method of claim 2 where the act of replacing further comprises the steps of: encoding a first binary value by replacing a loop unroll optimization with non-optimized code.
 4. The method of claim 2 further comprising the acts of: searching an executable code module for the presence of either the first non-optimized code segment or the first optimized code segment; for a current binary digit: assigning a first bit value if the non-optimized code segment is detected or assigning a second bit value if the optimized code segment is detected.
 5. A computer program product including computer readable program code for causing a computer to encode data as a watermark in an executable copy of a computer program generated by a compiler and to decode the watermark to recover the data, with the data including a plurality of digital characters, and with the compiler replacing selected non-optimized code segments with optimized code segments, said computer program product comprising: a computer readable medium having computer readable program code embodied therein, with said computer readable program code further comprising: computer readable encoding program code for causing a computer to detect an optimized code segment; computer readable encoding program code for causing a computer to change compiler optimization choices to encode a watermark.
 6. A computer program product including computer readable program code for causing a computer to encode data as a watermark in an executable copy of a computer program generated by a compiler and to decode the watermark to recover the data, with the data including a plurality of digital characters, and with the compiler replacing selected non-optimized code segments with optimized code segments, said computer program product comprising: a computer readable medium having computer readable program code embodied therein, with said computer readable program code further comprising: computer readable encoding program code for causing a computer to, for a current binary digit in the watermark, replace a first optimized code segment, included in the executable computer code, with a first non-optimized code segment only if the current binary digit has a first value, where the presence of the non-optimized code segment encodes the first value and the presence of optimized code segment encodes the second value of the current binary digit.
 7. The computer program product of claim 6 where the computer program for causing a computer to replace a first optimized code segment further comprises: computer readable encoding program code for causing a computer to encode a first binary value by replacing a loop unroll optimization with non-optimized code.
 8. The computer program product of claim 6 further comprising: computer readable encoding program code for causing a computer to search an executable code module for the presence of either the first non-optimized code segment or the first optimized code segment; computer readable encoding program code for causing a computer to, for a current binary digit, assign a first bit value if the non-optimized code segment is detected or assign a second bit value if the optimized code segment is detected.
 9. In a computer system, a system for encoding data as a watermark in executable computer code generated by a compiler, the data including a plurality of binary digits, with the compiler replacing selected non-optimized code segments with optimized code segments, the system comprising: means for detecting an optimized code segment; and means for changing compiler optimization choices to encode a watermark.
 10. In a computer system, a system for encoding data as a watermark in executable computer code generated by a compiler, the data including a plurality of binary digits, with the compiler replacing a first non-optimized code segment with a first optimized code segment, the system comprising: for a current binary digit in the watermark: means for replacing a first optimized code segment, included in the executable computer code, with a first non-optimized code segment only if the current binary digit has a first value, where the presence of the non-optimized code segment encodes the first value and the presence of optimized code segment encodes the second value of the current binary digit.
 11. The system of claim 10 further comprising: means for searching an executable code module for the presence of either the first non-optimized code segment or the first optimized code segment; for a current binary digit in a watermark: means for assigning a first bit value if the non-optimized code segment is detected or assigning a second bit value if the optimized code segment is detected.